CrowdAidRepair: A Crowd-Aided Interactive Data Repairing Method

Authors

  • Jian Zhou
  • Zhixu Li
  • Binbin Gu
  • Qing Xie
  • Jia Zhu
  • Xiangliang Zhang
  • Guoliang Li
Abstract

Data repairing aims to discover and correct erroneous data in databases. Traditional methods that rely on predefined quality rules to detect conflicts between data may fail to choose the right way to fix a detected conflict. Recent efforts turn to the power of the crowd for data repairing, but crowdsourcing has its own drawbacks, such as high human intervention cost and inevitably low efficiency. In this paper, we propose a crowd-aided interactive data repairing method that combines the advantages of rule-based and crowd-based methods. In particular, we investigate the interaction between crowd-based repairing and rule-based repairing, and show that applying crowd-based repairing to a small portion of values can greatly improve the repairing quality of the rule-based method. Although we prove that the optimal interaction scheme, which uses the fewest values for crowd-based repairing to maximize the imputation recall, is infeasible to achieve, our proposed solution identifies an efficient scheme by investigating the inconsistencies and dependencies between values in the repairing process. Our empirical study on three data collections demonstrates the high repairing quality of CrowdAidRepair, as well as the efficiency of the generated interaction scheme over baselines.
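The interplay between crowd-based and rule-based repairing can be illustrated with a toy sketch. This is not the paper's algorithm: the functional dependency `zip -> city`, the helper names `find_conflicts` and `repair`, the simulated `ask_crowd` oracle, and the sample rows are all assumptions made for illustration. The sketch also assumes the left-hand side of the rule (`zip`) is correct and that the crowd always answers truthfully.

```python
from collections import defaultdict


def find_conflicts(rows, lhs, rhs):
    """Group row indices by their lhs value and keep only the groups
    whose rhs values disagree (a violation of the rule lhs -> rhs)."""
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[row[lhs]].append(i)
    return {
        key: idx
        for key, idx in groups.items()
        if len({rows[i][rhs] for i in idx}) > 1
    }


def repair(rows, lhs, rhs, ask_crowd):
    """Ask the crowd about one cell per conflicting group, then let the
    rule propagate that answer to every other row in the group."""
    repaired = [dict(r) for r in rows]  # work on copies of the rows
    questions = 0
    for idx in find_conflicts(rows, lhs, rhs).values():
        answer = ask_crowd(idx[0], rhs)  # one crowd question per group
        questions += 1
        for i in idx:
            repaired[i][rhs] = answer  # rule-based propagation
    return repaired, questions


# Hypothetical dirty data: row 1 violates zip -> city.
rows = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "Boston"},
    {"zip": "10001", "city": "New York"},
    {"zip": "60601", "city": "Chicago"},
]

# Simulated crowd: looks up a hypothetical ground truth for one cell.
truth = {0: "New York", 1: "New York", 2: "New York", 3: "Chicago"}
repaired, questions = repair(rows, "zip", "city", lambda i, attr: truth[i])
```

In this toy setting, a single crowd answer resolves a conflict that the rule alone can detect but cannot fix with certainty, which is the kind of leverage the abstract describes. Choosing which cells to send to the crowd is the hard part the paper addresses, and this sketch does not attempt it.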


Similar resources

A Data-driven Method for Crowd Simulation using a Holonification Model

In this paper, we present a data-driven method for crowd simulation with a holonification model. With this extra module, the accuracy of the simulation increases and agents exhibit more realistic behaviors. First, we show how to use the concept of a holon in crowd simulation and how effective it is. For this reason, we use simple rules for holonification. Using real-world data, we model the...


A simulation as a service methodology with application for crowd modeling, simulation and visualization

Crowd modeling and simulation (M&S) has been used to support the analysis of the behavior of crowds, in order to predict the impact of pedestrian movement and to test design alternatives. In recent years, crowd M&S has become more complex, and new technologies such as CAD (computer-aided design) and BIM (building information modeling) authoring tools are being used to support the process. There...


Centroidal particles for interactive crowd simulation

Real-time crowd simulation is a challenging task that demands a careful consideration of the classic trade-off between accuracy and efficiency. Existing particle-based methods have seen success in simulating crowd scenarios for various applications in the architecture, military, urban planning, robotics, and entertainment (film and gaming) industries. In this paper we focus on local dynamics an...


Interactive and adaptive data-driven crowd simulation: User study

We present an adaptive data-driven algorithm for interactive crowd simulation. Our approach combines realistic trajectory behaviors extracted from videos with synthetic multi-agent algorithms to generate plausible simulations. We use statistical techniques to compute the movement patterns and motion dynamics from noisy 2D trajectories extracted from crowd videos. These learned pedestrian dynami...


Repairing CAD model errors based on the design history

For users of CAD data, few things are as frustrating as receiving unusable, poor quality data. Users often waste time fixing or rebuilding such data from scratch on the basis of paper drawings. While previous studies use the boundary representation (BRep) of CAD models, we propose an approach to repairing CAD model errors that is based on the design history. CAD model errors can be corrected by...




Publication date: 2016